# Lightweight inference

Baidu ERNIE 4.5 0.3B PT GGUF

A quantized build of Baidu's ERNIE-4.5-0.3B-PT model, produced with llama.cpp to reduce model size and improve inference efficiency.

Large Language Model Supports Multiple Languages

Echelon AI Med Qwen2 7B GGUF

This project provides GGUF quantized files for the Echelon-AI/Med-Qwen2-7B model, supported by Featherless AI, with the aim of improving model performance and reducing operating costs.

Large Language Model

featherless-ai-quants

Devstral Small 2505 3bit

This is a 3-bit quantized version of the mistralai/Devstral-Small-2505 model, converted for the MLX framework and supporting multilingual text generation tasks.

Large Language Model Supports Multiple Languages

Qwen3 0.6B GGUF

Qwen3 is the latest generation of the Tongyi Qianwen (Qwen) series of large language models, offering both dense and Mixture-of-Experts (MoE) models. Trained at large scale, Qwen3 brings substantial gains in reasoning, instruction following, agent capabilities, and multilingual support.

Large Language Model English

Kimi VL A3B Thinking 6bit

Kimi-VL-A3B-Thinking-6bit is a multilingual vision-language model converted to MLX format, supporting image-text-to-text tasks.

Transformers Other

3b Ko Ft Research Release Q4 K M GGUF

This is a 3B-parameter language model optimized for Korean, converted to GGUF format for compatibility with llama.cpp.

Large Language Model Korean

Llama 4 Scout 17B 16E Instruct GGUF

Llama-4-Scout-17B-16E-Instruct-GGUF is a quantized version of the Llama-4-Scout-17B-16E-Instruct model. It supports multilingual processing and is suited to chat and instruction tasks.

Large Language Model

Transformers Supports Multiple Languages

An image depth estimation model based on Latent Bridge Matching (LBM), which performs fast image-to-image translation by bridging distributions in latent space.

Zhaav Gemma3 4B

A Persian-language model fine-tuned from Gemma 3 using 4-bit QLoRA, suitable for running on ordinary hardware.

Large Language Model Other

Mistral Small 3.1 24b Instruct 2503 Hf GGUF

This is a GGUF format quantized version of the mrfakename/mistral-small-3.1-24b-instruct-2503-hf model, suitable for text generation tasks.

Large Language Model

Gemma 3 4b Pt Q4 0 GGUF

This is a GGUF-format model converted from Google's 4B-parameter Gemma 3 model, suitable for text generation tasks.

Large Language Model

Gemma 3 4b It GGUF

Gemma 3 4B IT is a lightweight open-source large language model released by Google. With 4B parameters, it is suited to dialogue and instruction-following tasks.

Large Language Model

Phi 4 Multimodal Instruct

Phi-4-multimodal-instruct is a lightweight open-source multimodal foundation model that builds on the language, vision, and speech research and datasets used for the Phi-3.5 and 4.0 models. It accepts text, image, and audio inputs and generates text outputs, with a context length of 128K tokens.

Transformers Supports Multiple Languages

Selene 1 Mini Llama 3.1 8B Q6 K GGUF

A GGUF-format model converted from AtlaAI/Selene-1-Mini-Llama-3.1-8B. It is suited to text generation tasks and supports multiple European languages.

Large Language Model Supports Multiple Languages

USER Bge M3 Q8 0 GGUF

This model is converted from deepvk/USER-bge-m3 into GGUF format, primarily used for sentence similarity and feature extraction tasks.

Text Embedding Other
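
Sentence similarity with an embedding model usually comes down to comparing two vectors. The following is a minimal sketch of cosine similarity in plain Python. The three short vectors are hypothetical stand-ins for real embeddings, which this listing does not provide, and `cosine_similarity` is an illustrative helper rather than part of any model's API.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical 4-dimensional embeddings (real models emit far more dimensions).
query = [0.1, 0.3, 0.5, 0.1]
close = [0.1, 0.25, 0.55, 0.05]   # points in nearly the same direction as query
far = [0.9, -0.2, 0.0, 0.1]       # points in a very different direction

print("query vs close:", round(cosine_similarity(query, close), 3))
print("query vs far:  ", round(cosine_similarity(query, far), 3))
```

A score near 1 means the two texts are embedded in nearly the same direction; a score near 0 suggests they are unrelated.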

Flan T5 Base Q4 K M GGUF

This model is a GGUF-format version of Google's flan-t5-base model. It supports multiple languages and is suited to text generation and reasoning tasks.

Large Language Model Supports Multiple Languages

Gemma 2 Baku 2b It

An instruction-tuned model based on Gemma 2 Baku 2B, fine-tuned to improve instruction following and suited to natural language processing tasks.

Large Language Model

Transformers Japanese

Llama 3.2 1B Instruct Q8 0 GGUF

This is Meta's 1-billion-parameter instruction-tuned model from the Llama 3.2 series, converted to GGUF format for use with llama.cpp.

Large Language Model Supports Multiple Languages
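
To give a rough sense of what a suffix such as Q8 0 or Q4 0 means for file size, the sketch below estimates weight storage from a parameter count and a bits-per-weight figure. It assumes the llama.cpp block layouts for Q8_0 (32 int8 weights plus one fp16 scale) and Q4_0 (32 4-bit weights plus one fp16 scale), takes the nominal 1 billion parameter count at face value, and ignores file metadata and any tensors stored at a different precision. `estimate_size_gb` is a hypothetical helper, not a llama.cpp function.

```python
# Nominal bits per weight, derived from the block layout:
#   Q8_0: 34 bytes per 32 weights -> 8.5 bits per weight
#   Q4_0: 18 bytes per 32 weights -> 4.5 bits per weight
BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q8_0": 34 * 8 / 32,
    "Q4_0": 18 * 8 / 32,
}


def estimate_size_gb(n_params: float, quant: str) -> float:
    """Rough weight-only size in decimal gigabytes (metadata ignored)."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1e9


# Assumption: exactly 1e9 parameters, the nominal figure for a "1B" model.
for quant in ("F16", "Q8_0", "Q4_0"):
    print(f"{quant}: ~{estimate_size_gb(1e9, quant):.2f} GB")
```

Real files will differ somewhat, since actual parameter counts are rarely round numbers and mixed schemes such as Q4 K M use different bit widths for different tensors.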

Llm Jp 3 1.8b Instruct

A large language model developed by the National Institute of Informatics in Japan, supporting Japanese and English, with instruction fine-tuning capabilities.

Large Language Model

Transformers Supports Multiple Languages

USER Bge M3 Q4 K M GGUF

This model is converted from deepvk/USER-bge-m3 to GGUF format, primarily used for sentence similarity calculation and feature extraction.

Text Embedding Other

Robbert 2022 Dutch Sentence Transformers Onnx

ONNX version of the Netherlands Forensic Institute's RobBERT-2022 Dutch sentence embedding model, optimized for speed and a small footprint.

Faster Whisper Large V2

This is the CTranslate2-converted version of OpenAI's Whisper large-v2 model, for efficient speech recognition.

Speech Recognition Supports Multiple Languages

A model based on sentence-transformers for determining the relevance between short texts and questions.

Transformers Other

Resnet Tiny Beans

An ultra-small model trained on a bean dataset, primarily for testing and demonstration purposes.

Image Classification
